On  Some  Heteroskedasticity-Robust  Estimators 
of  Variance-Covariance  Matrix 


Anil  K.  Bera  Totok  Suprayitno

Department  of  Economics  Department  of  Economics 

University  of  Illinois  University  of  Illinois 


Bureau  of  Economic  and  Business  Research 

College  of  Commerce  and  Business  Administration 

University  of  Illinois  at  Urbana-Champaign 


BEBR 


FACULTY  WORKING  PAPER  NO.  93-0105 

January  1993 


This  version:  January  1993 


ON  SOME  HETEROSKEDASTICITY-ROBUST  ESTIMATORS 
OF  VARIANCE-COVARIANCE  MATRIX 

Anil  K.  Bera  and  Totok  Suprayitno* 
University  of  Illinois  at  Urbana-Champaign

ABSTRACT 

Chesher  and  Jewitt  (1987)  demonstrated  that  White's  (1980)  consistent  estimator  of  the  variance-covariance 
matrix  in  heteroskedastic  models  could  be  severely  biased  if  the  design  matrix  is  highly  unbalanced.  In  this 
paper  we,  therefore,  reconsider  the  Rao  (1970)  minimum  norm  quadratic  unbiased  estimator  (MINQUE). 
We  derive  the  analytical  expressions  for  the  mean  square  errors  (MSE)  of  White's  (1980),  one  of  MacKinnon 
and  White's  (1985)  and  MINQU  estimators,  and  perform  a  numerical  comparison.  Our  analysis  shows 
that  although  MINQUE  is  unbiased  by  construction,  it  has  very  large  variance  particularly  for  the  highly 
unbalanced  design  matrices.  Since  the  variance  is  the  dominant  factor  in  our  MSE  computation,  MINQUE 
is  not  the  preferred  estimator  in  terms  of  MSE  comparison.  We  also  studied  the  finite  sample  behavior 
of  the  confidence  interval  of  regression  coefficients  in  terms  of  coverage  probabilities  based  on  different 
variance-covariance  matrix  estimators.  Our  results  indicate  that  although  MINQUE  generally  has  the 
largest  MSE,  it  performs  relatively  well  in  terms  of  coverage  probabilities.  Overall,  taking  both  MSE 
and  coverage  probabilities  as  choice  criteria,  the  'almost  unbiased'  estimator  suggested  in  MacKinnon  and 
White  (1985)  is  the  winner. 


1.  Introduction 


When the disturbance process in a regression model exhibits heteroskedasticity, the invalidity of standard inference procedures stems from incorrect estimation of the standard errors. A conventional way of overcoming this problem in econometric modeling is to specify the model under an assumed error structure and apply Aitken's weighted least squares. This method does not seem attractive to many practitioners, as there is usually little or no guidance regarding the form of heteroskedasticity. White (1980) proposed an estimator of the variance-covariance matrix of the least squares regression coefficients which, under certain conditions, is consistent under heteroskedasticity. Other attractive features of this estimator are that it is obtained without specifying the structural form of heteroskedasticity and that it is very easy to compute. This may explain its popularity in applied econometric work.

Recently, several researchers have criticized the widespread acceptance of White's procedure, which some call "White washing". Chesher and Jewitt (1987) showed analytically that for certain regression designs the estimator exhibits a large bias even in large samples. In particular, a severe bias arises when the regression design contains a point of high leverage, rendering inferences drawn from this estimator uninformative. A Monte Carlo study conducted by Mishkin (1990) also indicated that the use of White's standard errors cannot always correct the inferences, and in some situations can make things even worse.

Alternatives to the heteroskedasticity-consistent variance estimator are available. A close variant of White's estimator is the one suggested by MacKinnon and White (1985). They considered an estimator based on the unreplicated "almost unbiased estimator" of Horn, Horn and Duncan (1975). This estimator is unbiased when there is no heteroskedasticity, but is biased if the homoskedasticity assumption is not satisfied. In the special case of balanced regression designs, it reduces to the estimator considered by Hinkley (1977), which differs from White's estimator only by a proportionality constant. Other alternatives include those based on the minimum norm quadratic unbiased estimation (MINQUE) principle of Rao (1970), the resampling method of Wu (1986) and the maximum likelihood estimation of Hartley and Jayatillake (1973). Extensions to the more general case where the disturbances are also serially correlated are provided by Newey and West (1987), Wooldridge (1989) and Andrews (1991).

In this paper we reconsider the MINQUE principle to obtain an unbiased estimator of the variance-covariance matrix under heteroskedasticity. The paper proceeds as follows. In Section 2 we briefly review White's consistent estimator, highlighting its bias and indicating a simple way to eliminate the bias. In Section 3 we discuss the MINQUE procedure within the framework of a variance component model. The exact expressions for the finite sample variance of different estimators are derived in Section 4, and some numerical and Monte Carlo results are given in Section 5. The last section provides a conclusion.


2.  The  Bias  of  White's  Heteroskedasticity-Consistent  Estimator 

We consider the standard regression model

$$ y = X\beta + \varepsilon, \tag{1} $$

where $y$ is an $(n \times 1)$ vector of dependent variables, $X$ is an $(n \times k)$ matrix of independent variables, $\beta$ is a $(k \times 1)$ vector of unknown parameters, and $\varepsilon$ is an $(n \times 1)$ random vector with mean zero and variance-covariance matrix $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$. Under this setup, the ordinary least squares (OLS) estimator of $\beta$ is given by

$$ \hat\beta = (X'X)^{-1}X'y, \tag{2} $$

and its true variance-covariance matrix is

$$ \Omega_T = (X'X)^{-1}X'\Sigma X(X'X)^{-1}. \tag{3} $$

Under the homoskedasticity assumption $\varepsilon \sim (0, \sigma^2 I)$ this variance-covariance matrix is estimated by

$$ \hat\Omega_S = \hat\sigma^2 (X'X)^{-1}, \tag{4} $$

where $\hat\sigma^2$ is the standard OLS estimator of $\sigma^2$. This latter estimator is inconsistent if in fact the disturbances are heteroskedastic.

White's (1980) heteroskedasticity-consistent estimator is given by [see also Eicker (1963)]

$$ \hat\Omega_W = (X'X)^{-1}X'\hat\Sigma X(X'X)^{-1}, \tag{5} $$

where $\hat\Sigma = \mathrm{diag}(\hat\varepsilon_1^2, \ldots, \hat\varepsilon_n^2)$, with $\hat\varepsilon_i$ being the OLS residual. Note that this $\hat\Sigma$ is similar to the unreplicated J.N.K. Rao's (1972) modified MINQUE or the unreplicated average of squared residuals of Horn, Horn and Duncan (1975). Unlike the traditional ways of overcoming the heteroskedasticity problem in the econometrics literature, $\hat\Omega_W$ does not require specification of the particular form of heteroskedasticity.

Under the regularity conditions given in White (1980), $\hat\Omega_W$ is a consistent estimator for $\Omega_T$, but it is generally biased under both homoskedastic and heteroskedastic disturbances. Following Chesher and Jewitt (1987), let us define $H = X(X'X)^{-1}X'$ and $M = I - H$, and let $h_i$ be the $i$-th column of matrix $H$, $m_i$ the $i$-th column of matrix $M$, and $h_{ij}$ the $(i,j)$-th element of matrix $H$. We have $\hat\varepsilon = (\hat\varepsilon_1, \ldots, \hat\varepsilon_n)' = M\varepsilon$ and therefore,

$$ E(\hat\varepsilon_i^2) = m_i'\Sigma m_i = \sigma_i^2 - 2h_{ii}\sigma_i^2 + \sum_{j=1}^{n} h_{ij}^2\sigma_j^2 = \sigma_i^2 - 2h_i'h_i\sigma_i^2 + h_i'\Sigma h_i, \tag{6} $$

since $H$ is an idempotent matrix. The bias of $\hat\varepsilon_i^2$ then is given by

$$ \mathrm{bias}(\hat\varepsilon_i^2) = E(\hat\varepsilon_i^2) - \sigma_i^2 = h_i'\Sigma h_i - 2h_i'h_i\sigma_i^2 = h_i'(\Sigma - 2\sigma_i^2 I)h_i, \tag{7} $$

and the bias of White's consistent estimator $\hat\Omega_W$ is

$$ \mathrm{bias}(\hat\Omega_W) = (X'X)^{-1}X'BX(X'X)^{-1}, \tag{8} $$

where $B = \mathrm{diag}\big(h_1'(\Sigma - 2\sigma_1^2 I)h_1, \ldots, h_n'(\Sigma - 2\sigma_n^2 I)h_n\big)$, which is not zero under both homoskedastic and heteroskedastic disturbances. Obviously, when $\max_i(\sigma_i^2) < 2\min_i(\sigma_i^2)$, all elements of $B$ are negative, and therefore the standard errors of all elements of $\hat\beta$ would be underestimated.

When the disturbances are homoskedastic, the bias of White's consistent estimator will be $-\sigma^2(X'X)^{-1}X'[\mathrm{diag}(h_{11}, \ldots, h_{nn})]X(X'X)^{-1}$. Horn, Horn and Duncan (1975) proposed $\hat\varepsilon_i^2/(1 - h_{ii})$ as an almost unbiased estimator (AUE) for $\sigma_i^2$, which was then used by MacKinnon and White (1985) to modify White's consistent estimator. MacKinnon and White's (1985) estimator can be written as

$$ \hat\Omega_{MW} = (X'X)^{-1}X'\tilde\Sigma X(X'X)^{-1}, \tag{9} $$

where $\tilde\Sigma = \mathrm{diag}\big(\hat\varepsilon_1^2/(1 - h_{11}), \ldots, \hat\varepsilon_n^2/(1 - h_{nn})\big)$. This estimator is of course unbiased only when the disturbances are homoskedastic. In the special case of a balanced design matrix $X$, where $h_{ii} = k/n$ for all $i$, $\hat\Omega_{MW}$ reduces to $(n/(n - k))\hat\Omega_W$, which is the variance-covariance matrix estimator suggested by Hinkley (1977). Both MacKinnon and White's (1985) and Hinkley's (1977) estimators, however, are biased when the disturbances are heteroskedastic.
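The estimators in (5) and (9) and the exact bias in (8) can be sketched numerically. The following is a minimal illustration in Python with NumPy, not the authors' code; the function names, the random seed and the small simulated design are assumptions made for the example.

```python
import numpy as np

def hat_matrix(X):
    """H = X (X'X)^{-1} X'."""
    return X @ np.linalg.solve(X.T @ X, X.T)

def sandwich(X, weights):
    """(X'X)^{-1} X' diag(weights) X (X'X)^{-1}."""
    XtX_inv = np.linalg.inv(X.T @ X)
    return XtX_inv @ (X.T * weights) @ X @ XtX_inv

def white_and_mw(X, y):
    """White's estimator (5) and MacKinnon-White's estimator (9)."""
    H = hat_matrix(X)
    e2 = (y - H @ y) ** 2              # squared OLS residuals
    h = np.diag(H)                     # leverages h_ii
    return sandwich(X, e2), sandwich(X, e2 / (1.0 - h))

def white_bias(X, sigma2):
    """Exact bias of White's estimator, equation (8)."""
    H = hat_matrix(X)
    # i-th diagonal element of B: sum_j h_ij^2 sigma_j^2 - 2 h_ii sigma_i^2
    b = (H ** 2) @ sigma2 - 2.0 * np.diag(H) * sigma2
    return sandwich(X, b)

# Illustrative design (assumed): intercept, log-normal and N(2, 1) regressors
rng = np.random.default_rng(0)
n = 30
X = np.column_stack([np.ones(n), rng.lognormal(size=n), rng.normal(2.0, 1.0, size=n)])
bias_homo = white_bias(X, np.full(n, 20.0))   # homoskedastic case, sigma^2 = 20
```

Under homoskedasticity the computed bias should coincide with $-\sigma^2(X'X)^{-1}X'\mathrm{diag}(h_{ii})X(X'X)^{-1}$, and for a balanced design the second estimator should equal $n/(n-k)$ times the first.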

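Given the relation between $\hat\varepsilon$ and $\varepsilon$, the derivation of an unbiased version of White's estimator for a general design matrix $X$ is straightforward. Let us denote by $m_{ij}$ the $(i,j)$-th element of matrix $M$. Then, from (6), we have

$$ E(\hat\varepsilon_i^2) = \sum_{j=1}^{n} m_{ij}^2\sigma_j^2 \qquad \text{for } i = 1, \ldots, n, \tag{10} $$

which can be expressed as

$$ E(\hat\varepsilon^{(2)}) = Q\sigma^{(2)}, \quad \text{say}, \tag{11} $$

where $\hat\varepsilon^{(2)} = (\hat\varepsilon_1^2, \ldots, \hat\varepsilon_n^2)'$ is the vector of squared OLS residuals, $\sigma^{(2)} = (\sigma_1^2, \ldots, \sigma_n^2)'$ and $Q = [m_{ij}^2]$.

Therefore, $\tilde\sigma^{(2)} = (\tilde\sigma_1^2, \tilde\sigma_2^2, \ldots, \tilde\sigma_n^2)' = Q^{-1}\hat\varepsilon^{(2)}$ is an unbiased estimator of $\sigma^{(2)}$ if $Q$ is nonsingular, and our unbiased version of White's estimator can be obtained by putting the $i$-th element of $\tilde\sigma^{(2)}$ instead of $\hat\varepsilon_i^2$ in the expression for $\hat\Omega_W$. It is interesting to see that $\tilde\sigma^{(2)}$ turns out to be exactly the MINQUE of $\sigma^{(2)}$ proposed by Rao (1970), as demonstrated in the following section.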
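The unbiasedness argument can be checked at the level of expectations: since $E(\hat\varepsilon_i^2) = m_i'\Sigma m_i$, solving the linear system with $Q = [m_{ij}^2]$ should return the true variances. The sketch below uses Python with NumPy; the function names, the evenly spaced design and the variance pattern are illustrative assumptions, not part of the paper.

```python
import numpy as np

def residual_maker(X):
    """M = I - X (X'X)^{-1} X'."""
    n = X.shape[0]
    return np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

def minque_variances(X, y):
    """Solve Q s = (squared OLS residuals), with Q the elementwise square of M."""
    M = residual_maker(X)
    Q = M ** 2
    return np.linalg.solve(Q, (M @ y) ** 2)

# Assumed illustrative design: intercept plus an evenly spaced regressor
x = np.linspace(-1.0, 1.0, 12)
X = np.column_stack([np.ones(12), x])
sigma2 = 1.0 + 4.0 * x ** 2                       # assumed true variances

M = residual_maker(X)
Q = M ** 2
expected_e2 = np.diag(M @ np.diag(sigma2) @ M)    # E(resid_i^2) = m_i' Sigma m_i
recovered = np.linalg.solve(Q, expected_e2)       # should reproduce sigma2
```

For a single sample, `minque_variances(X, y)` returns the individual variance estimates, some of which may be negative.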

3.   MINQUE  of  Variance-Covariance  Matrix 
in  Heteroskedastic  Linear  Models 


We write the disturbance process $\varepsilon$ of (1) in an identity similar to the variance components type model as follows,

$$ \varepsilon = u_1\varepsilon_1 + \cdots + u_n\varepsilon_n, \tag{12} $$

where $u_i$ $(i = 1, \ldots, n)$ is an $(n \times 1)$ known vector whose $i$-th element is one and the rest are zero. Since $\mathrm{var}(\varepsilon_i) = \sigma_i^2$ $(i = 1, \ldots, n)$, the variance-covariance matrix of $\varepsilon$ is given by $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$, which is exactly the heteroskedastic problem we usually consider. Rao (1970) approached precisely this problem in a somewhat different way, and obtained the MINQUE of $\sigma_i^2$. We will see that the variance component representation of the disturbance process above arrives at the same result and provides a more convenient way of obtaining the MINQUE of heteroskedastic variances. This variance component framework is also useful for analyzing various forms of heteroskedasticity.

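Turning back to our present problem, our interest is to obtain the MINQUE of the $k \times k$ variance-covariance matrix $\Omega_T = (X'X)^{-1}X'\Sigma X(X'X)^{-1}$. Denoting by $\chi_{ij}$ the $(i,j)$-th element of $(X'X)^{-1}X'$ for $i = 1, \ldots, k$ and $j = 1, \ldots, n$, the $(r,s)$-th element of $\Omega_T$ may be written as a linear combination of the $\sigma_j^2$ $(j = 1, \ldots, n)$:

$$ \Omega_T(r,s) = \sum_{j=1}^{n} \chi_{rj}\chi_{sj}\sigma_j^2 = a_{rs}'\sigma^{(2)}, \qquad r, s = 1, \ldots, k, \tag{13} $$

where $a_{rs}' = (\chi_{r1}\chi_{s1}, \ldots, \chi_{rn}\chi_{sn})$ and $\sigma^{(2)} = (\sigma_1^2, \ldots, \sigma_n^2)'$.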

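It is easy to see that the MINQUE of $\Omega_T(r,s)$ can be obtained directly from Rao (1972). Let us write $V_i = u_iu_i'$; then $\sum_{i=1}^{n} V_i = I$. The MINQUE of $\Omega_T(r,s)$ is given by $y'Ay$, with $A$ satisfying

$$ \min_A \ \mathrm{tr}(AA) \quad \text{subject to } AX = 0 \text{ and } \mathrm{tr}(AV_i) = \chi_{ri}\chi_{si}, \quad i = 1, 2, \ldots, n, \tag{14} $$

where $\mathrm{tr}(\cdot)$ denotes the trace of a matrix and $\chi_{ij}$ is the $(i,j)$-th element of $(X'X)^{-1}X'$. The above two restrictions impose invariance and unbiasedness. The solution to (14) is given by

$$ A^* = \sum_{i=1}^{n} \lambda_i MV_iM, \tag{15} $$

where $\lambda' = (\lambda_1, \lambda_2, \ldots, \lambda_n)$ satisfies

$$ \lambda'Q = a_{rs}', \tag{16} $$

with $Q = [\mathrm{tr}(MV_iMV_j)]$. Here we write $[a_{ij}]$ to denote a matrix whose $(i,j)$-th element is $a_{ij}$. Some simple algebra shows that $Q$ is the Hadamard product of matrix $M$ with itself; explicitly,

$$ Q = M * M = \begin{bmatrix} m_{11}^2 & m_{12}^2 & \cdots & m_{1n}^2 \\ m_{21}^2 & m_{22}^2 & \cdots & m_{2n}^2 \\ \vdots & \vdots & & \vdots \\ m_{n1}^2 & m_{n2}^2 & \cdots & m_{nn}^2 \end{bmatrix}, \tag{17} $$

where $m_{ij}$ is the $(i,j)$-th element of $M$.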

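Now, let us denote by $\hat\Omega_M(r,s)$ the MINQUE of $\Omega_T(r,s) = a_{rs}'\sigma^{(2)}$. It is given by

$$ \hat\Omega_M(r,s) = y'A^*y = \sum_{i=1}^{n} \lambda_i y'MV_iMy = \sum_{i=1}^{n} \lambda_i \hat\varepsilon'V_i\hat\varepsilon = \sum_{i=1}^{n} \lambda_i\hat\varepsilon_i^2 = \lambda'\hat\varepsilon^{(2)}, \tag{18} $$

where $\hat\varepsilon^{(2)} = (\hat\varepsilon_1^2, \ldots, \hat\varepsilon_n^2)'$ is the vector of squared OLS residuals. If $Q$ is non-singular,

$$ \hat\Omega_M(r,s) = a_{rs}'Q^{-1}\hat\varepsilon^{(2)}. \tag{19} $$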

A contrast between MINQUE and White's consistent estimator of $\Omega_T$ can be made in terms of their weighting schemes. Each MINQUE of $\sigma_i^2$ is a linear combination of the squares of all the OLS residuals, with weights that depend on the design matrix $X$. In White's approach, on the other hand, all the weight is given to the respective squared OLS residual, i.e., $\sigma_i^2$ is estimated by $\hat\varepsilon_i^2$ for all $i$. Clearly, White's estimator is always biased unless the matrix $H = X(X'X)^{-1}X' = [h_{ij}]$ is zero. Chesher and Jewitt (1987) specifically show that very severe bias may arise when there are large $h_{ii}$, and that it becomes extreme as $\max(h_{ii})$ approaches 1, rendering the inferences drawn from White's consistent estimator uninformative. In such a situation the MINQUE may be useful as a point estimator.

The obvious problem encountered in using MINQUE is that matrix $Q$ may be singular. An easily verifiable condition for nonsingularity of matrix $Q$ is given by Horn and Horn (1975), namely,

$$ \max(h_{ii}) < \tfrac{1}{2} \qquad \text{for } i = 1, \ldots, n, \tag{20} $$

where $h_{ii} = 1 - m_{ii} = x_i'(X'X)^{-1}x_i$, which is often regarded as a measure of points of leverage of the regression design. To see the above result note that $Q$ is nonsingular if it is a dominant diagonal matrix, i.e., if $m_{ii}^2 > \sum_{j \ne i} m_{ij}^2$ for all $i = 1, 2, \ldots, n$ [see, e.g., Graybill (1983) pp. 250-251]. Since $M$ is idempotent, $\sum_{j=1}^{n} m_{ij}^2 = m_{ii}$, and the required condition becomes $m_{ii}(1 - 2m_{ii}) < 0$, which is the same as $3h_{ii} - 2h_{ii}^2 < 1$. Note that $3h_{ii} - 2h_{ii}^2$ is a monotonically increasing function for $0 < h_{ii} < \tfrac{3}{4}$, and it is less than 1 if $h_{ii} < \tfrac{1}{2}$. Another set of conditions for nonsingularity of matrix $Q$ is also suggested by Rao (1970), but it is apparently not easy to verify for economic data where we usually have a quite complicated regression design.
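Condition (20) and the diagonal dominance of $Q$ from which it is derived can both be checked directly for a given design. The sketch below is in Python with NumPy; the function names and the two small example designs are assumptions chosen for illustration.

```python
import numpy as np

def leverage(X):
    """Diagonal elements h_ii of H = X (X'X)^{-1} X'."""
    return np.einsum("ij,ji->i", X, np.linalg.solve(X.T @ X, X.T))

def condition_20_holds(X):
    """Sufficient condition (20): max h_ii < 1/2."""
    return bool(leverage(X).max() < 0.5)

def q_is_diagonally_dominant(X):
    """Check m_ii^2 > sum_{j != i} m_ij^2 for every row of Q = M * M."""
    n = X.shape[0]
    M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)
    Q = M ** 2
    d = np.diag(Q)
    off = Q.sum(axis=1) - d
    return bool(np.all(d > off))

# Assumed examples: a well-spread design and one with a high-leverage point
X_ok = np.column_stack([np.ones(12), np.linspace(-1.0, 1.0, 12)])
X_bad = np.column_stack([np.ones(8), np.array([0, 1, 2, 3, 4, 5, 6, 20], dtype=float)])
```

Because the two checks are algebraically equivalent row by row, they should agree for any full-rank design; a failure of both does not by itself imply that $Q$ is singular, since the condition is only sufficient.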

The  condition  (20)  is  sufficient  but  not  necessary.  For  example,  consider  the  following 
regression  design 

X'  = 

The matrix $Q = (I - X(X'X)^{-1}X') * (I - X(X'X)^{-1}X')$ is nonsingular even though $\max_{i=1,\ldots,6} h_{ii} = 0.8$. A set of necessary and sufficient conditions is proposed by Mallela (1972) as follows. Let $X_1$ be a set of $k$ linearly independent rows of $X$, and $X_2$ be the set of rows of $X$ complementary to $X_1$ in $X$. Define $Z = X_2X_1^{-1}$, and let $z_i$ be the $i$-th row of $Z$. Further, let $R$ be the $((n - k)(n - k - 1)/2) \times k$ matrix whose rows are the Hadamard products $z_i * z_j$ for $i < j$, $i, j = 1, 2, \ldots, n - k$. Then $Q$ is nonsingular if and only if the rank of $R$ is equal to $k$. This set of conditions, however, is neither simple nor easy to interpret and may be computationally burdensome.

Another drawback of the MINQUE procedure is the possibility of getting negative estimates of some individual variances $\sigma_i^2$. A common suggestion when this problem arises is to apply an ad hoc procedure, replacing the non-positive values by a small positive number, resulting in a truncated MINQUE

$$ \tilde\sigma_i^{*2} = \begin{cases} \tilde\sigma_i^2 & \text{if } \tilde\sigma_i^2 > 0 \\ \delta_i & \text{if } \tilde\sigma_i^2 \le 0, \end{cases} $$

where $\tilde\sigma_i^2$ denotes the MINQUE of $\sigma_i^2$ and the $\delta_i$'s are some small numbers guaranteed to be greater than zero. This modification, of course, destroys the unbiasedness property of MINQUE and may not be theoretically justified.
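The truncation rule amounts to an elementwise replacement of non-positive estimates. A minimal sketch in Python with NumPy follows; the function name and the argument `floor` (standing for the $\delta_i$'s, either a scalar or one value per observation) are assumptions made for the example.

```python
import numpy as np

def truncate_minque(sigma2_minque, floor):
    """Replace non-positive MINQUE variance estimates by positive floor values."""
    s = np.asarray(sigma2_minque, dtype=float)
    # Broadcast a scalar floor to one value per observation
    floor = np.broadcast_to(np.asarray(floor, dtype=float), s.shape)
    return np.where(s > 0.0, s, floor)

truncated = truncate_minque([2.0, -0.5, 0.0], 1e-6)
```

Passing an array for `floor` allows observation-specific replacements, such as the rescaled squared residuals used in the numerical exercise of Section 5.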

4.  The  Finite  Sample  Variance  of  the  Estimators 

Given the seriousness of the bias in White's estimator, it is natural to ask how the unbiased estimator MINQUE will perform in terms of variance. It is expected that MINQUE will have higher variance. As a compromise, we will later use the mean square error (MSE), which combines bias and variance, as a criterion to compare different estimators. From Section 2, the biases of all variance estimators can be easily obtained. In this section we derive the variances of $\hat\Omega_S(r,s)$, $\hat\Omega_W(r,s)$, $\hat\Omega_A(r,s)$ and $\hat\Omega_M(r,s)$, corresponding respectively to the standard OLS, White's (1980) consistent, the almost unbiased MacKinnon and White (1985) and MINQU estimators of the $(r,s)$-th element of the true OLS variance $\Omega_T(r,s)$. We assume the disturbance vector $\varepsilon$ is normally distributed with mean zero and variance $\mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$. To simplify the derivation, we first note, as we mentioned earlier, that these estimators differ from each other only in their weighting scheme of the squared OLS residuals. Specifically, all but the standard OLS variance estimator can be written as

$$ \hat\Omega(r,s) = a_{rs}'W\hat\varepsilon^{(2)}, \qquad r, s = 1, \ldots, k, \tag{21} $$

where $\hat\varepsilon^{(2)} = (\hat\varepsilon_1^2, \ldots, \hat\varepsilon_n^2)'$ is the vector of squared OLS residuals and $W$ is an $(n \times n)$ weighting matrix. For the White (1980), MacKinnon and White (1985) and MINQU estimators, $W$ corresponds to $I_{n \times n}$, $\mathrm{diag}(1/m_{11}, \ldots, 1/m_{nn})$ and $Q^{-1}$, respectively. The standard OLS variance estimator is the special case for which $a_{rs}$ is the scalar corresponding to the $(r,s)$-th element of $(X'X)^{-1}$ and $W$ is a vector of $n$ elements, each equal to $1/(n - k)$. Under the representation (21), the variance of $\hat\Omega(r,s)$ is given by

$$ \mathrm{var}(\hat\Omega(r,s)) = a_{rs}'W\,V(\hat\varepsilon^{(2)})\,W'a_{rs}, \tag{22} $$

where $V(\hat\varepsilon^{(2)})$ denotes the variance-covariance matrix of $\hat\varepsilon^{(2)}$. It can be shown that $V(\hat\varepsilon^{(2)})$ has the following typical elements (see the Appendix for derivation):

$$ \mathrm{var}(\hat\varepsilon_i^2) = 2\Big(\sum_{t=1}^{n} m_{it}^2\sigma_t^2\Big)^2 = 2\gamma_{ii}, \tag{23} $$

$$ \mathrm{cov}(\hat\varepsilon_i^2, \hat\varepsilon_j^2) = 2\Big(\sum_{t=1}^{n} m_{it}m_{jt}\sigma_t^2\Big)^2 = 2\gamma_{ij}. \tag{24} $$

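Let us denote by $q^{ij}$ the $(i,j)$-th element of matrix $Q^{-1}$, by $\chi_{ij}$ the $(i,j)$-th element of $(X'X)^{-1}X'$, and write $\gamma_{ij} = \big(\sum_{t=1}^{n} m_{it}m_{jt}\sigma_t^2\big)^2$. Then the variance of the MINQUE $\hat\Omega_M(r,s)$ is

$$ \mathrm{var}(\hat\Omega_M(r,s)) = 2\sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{f=1}^{n}\sum_{g=1}^{n} \chi_{ri}\chi_{si}\chi_{rj}\chi_{sj}\,q^{if}q^{jg}\,\gamma_{fg} \tag{25} $$

for $r, s = 1, \ldots, k$. If $q^{ii} = 1/m_{ii}$ and $q^{ij} = 0$ for $i \ne j$, $i, j = 1, 2, \ldots, n$, then it reduces to the variance of $\hat\Omega_A(r,s)$,

$$ \mathrm{var}(\hat\Omega_A(r,s)) = 2\sum_{i=1}^{n}\sum_{j=1}^{n} \frac{\chi_{ri}\chi_{si}\chi_{rj}\chi_{sj}}{m_{ii}m_{jj}}\,\gamma_{ij}. \tag{26} $$

The variance of the White estimator $\hat\Omega_W(r,s)$ is the special case when $q^{ii} = 1$ and $q^{ij} = 0$ for $i \ne j$, $i, j = 1, 2, \ldots, n$; it is simply

$$ \mathrm{var}(\hat\Omega_W(r,s)) = 2\sum_{i=1}^{n}\sum_{j=1}^{n} \chi_{ri}\chi_{si}\chi_{rj}\chi_{sj}\,\gamma_{ij}. \tag{27} $$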

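More trivially, the variance of the standard OLS estimator $\hat\Omega_S(r,s)$ is given by

$$ \mathrm{var}(\hat\Omega_S(r,s)) = a_{rs}^2\,\mathrm{var}(\hat\sigma^2), \tag{28} $$

where $\hat\sigma^2$ is the usual OLS estimator of the variance of $\varepsilon_i$ under the homoskedasticity assumption and $a_{rs}$ is the $(r,s)$-th element of $(X'X)^{-1}$. Under heteroskedasticity, $\mathrm{var}(\hat\sigma^2)$ is of the form

$$ \mathrm{var}(\hat\sigma^2) = \frac{2}{(n - k)^2}\sum_{i=1}^{n}\sum_{j=1}^{n}\gamma_{ij}, \tag{29} $$

where $\gamma_{ij} = \big(\sum_{t=1}^{n} m_{it}m_{jt}\sigma_t^2\big)^2$. Note that because $M = [m_{ij}]$ is idempotent, $\sum_{i=1}^{n}\sum_{j=1}^{n}\big(\sum_{t=1}^{n} m_{it}m_{jt}\big)^2 = \sum_{i=1}^{n} m_{ii} = n - k$. Therefore, when in fact $\sigma_i^2 = \sigma^2$ for all $i$,

$$ \mathrm{var}(\hat\sigma^2) = \frac{2\sigma^4}{n - k}, $$

which is the standard formula under homoskedasticity.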

The first three variances have very similar algebraic expressions. Obviously, since $0 < m_{ii} < 1$, the variances of White's consistent estimator will never exceed those of MacKinnon and White's. In the special case of a balanced experimental design for which $m_{ii} = (n - k)/n$, they differ only by a proportionality constant. An analytical comparison with the MINQUE variance, however, seems to be difficult due to the complicated nature of the weights $q^{ij}$. For a comparison of those four variances, we will do a simple numerical exercise for given design matrices with different sample sizes and leverage points.
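The exact variance expressions in this section share the quadratic form (22), so they can be evaluated with a few matrix operations. The sketch below is in Python with NumPy and assumes normal disturbances as in the text; the function name, the dictionary keys and the zero-based indices `r`, `s` are illustrative assumptions.

```python
import numpy as np

def exact_variances(X, sigma2, r, s):
    """Exact variances of four estimators of Omega_T(r, s); r, s are zero-based."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    C = XtX_inv @ X.T                          # k x n matrix (X'X)^{-1} X'
    M = np.eye(n) - X @ C
    a = C[r] * C[s]                            # the vector a_rs
    # gamma_ij = (sum_t m_it m_jt sigma_t^2)^2, i.e. the squared elements of M Sigma M
    G = (M @ (sigma2[:, None] * M)) ** 2
    V = 2.0 * G                                # covariance matrix of squared residuals
    weights = {
        "white": np.eye(n),
        "mw": np.diag(1.0 / np.diag(M)),
        "minque": np.linalg.inv(M ** 2),       # requires Q = M * M to be nonsingular
    }
    out = {name: float(a @ W @ V @ W.T @ a) for name, W in weights.items()}
    # Standard OLS estimator: a_rs^2 * var(sigma_hat^2), equations (28) and (29)
    out["ols"] = float(XtX_inv[r, s] ** 2 * V.sum() / (n - k) ** 2)
    return out
```

Combining these variances with the squared exact biases from Section 2 would give the exact MSE of each estimator without any sampling experiment.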

5.  A  Numerical  Exercise 


To the best of our knowledge, a numerical comparison of the MSE of those estimators based on their exact expressions has not been done before. Since we have the exact expression of the variance of those estimators, no sampling experiment is required for the MSE comparison. However, to study the finite sample behavior of the confidence intervals of regression coefficients in terms of coverage probabilities, we carry out a Monte Carlo study. Related simulation studies have been done previously by MacKinnon and White (1985) and Nanayakkara and Cressie (1991). MacKinnon and White (1985) did not consider MINQUE and concentrated on the behavior of the $t$-statistic based on different variance-covariance matrix estimators. They also did not experiment with different design matrices. Nanayakkara and Cressie (1991) studied mainly the coverage probability of confidence intervals and did not include MINQUE in their study when the regression model has an intercept. They also used a 'well-behaved' design matrix in their simulation. Here we use less well-behaved design matrices so as to allow an investigation of the performance of each estimator for design matrices of different nature. Therefore, our numerical exercise could be viewed as a complement to the simulation studies of the above two papers. Our experiment is based on a linear regression model specified as follows:

$$ y_i = \beta_0 + \beta_1x_{1i} + \beta_2x_{2i} + \varepsilon_i, \qquad i = 1, \ldots, n, $$

where we set the parameter $\beta' = (\beta_0, \beta_1, \beta_2) = (10.0, 3.5, 2.5)$. The first regressor $x_1$ was obtained from an independent log-normal pseudo-random number generator whose corresponding normal variable has mean zero and unit variance. That is, $x_{1i} = e^{z_i}$ where $z_i$ is a $N(0, 1)$ variable. The regressor $x_2$ is simply a $N(2, 1)$ variable. The distributions were chosen to generate sufficiently high points of leverage in the design matrix, so that we could examine their effect on the behavior of the estimators under consideration. In this experiment all pseudo-random $N(0, 1)$ variables were generated by the IMSL routine RNNOA, and the log-normal variables were generated directly from the routine RNLNL.

To investigate the effect of heteroskedasticity on the performance of the estimators we assumed a certain structure of the disturbances. In particular, we assumed

$$ \varepsilon_i = \sigma_iv_i, $$

where $\sigma_i$ is a function of non-stochastic variables and $v_i$ is a white noise process. The variance of $\varepsilon_i$ then is $\sigma_i^2$, which determines the nature of the heteroskedasticity depending on its prespecified functional form. In our experiment we considered two different models of heteroskedasticity. First, we assumed the heteroskedasticity was induced by the regressors according to

Model 1: $\sigma_i^2 = \lambda_0 + \lambda_1x_{1i} + \lambda_2x_{2i}$.

The second structure of heteroskedasticity was specified as

Model 2: $\sigma_i^2 = \theta_0 + \theta_1u_i + \theta_2u_i^2$,

where $u_i$ was drawn independently from $N(0, 1)$. The experiment was conducted for 1000 replications with $x_{1i}$, $x_{2i}$ and $u_i$ held fixed in each replication.

For each structure we carried out the experiments by varying the degree of heteroskedasticity, which can be easily accomplished by selecting different values of $\lambda' = (\lambda_0, \lambda_1, \lambda_2)$ and $\theta' = (\theta_0, \theta_1, \theta_2)$. Following Chesher and Jewitt (1987), we measure the degree of heteroskedasticity by the ratio $\max_i(\sigma_i^2)/\min_i(\sigma_i^2)$. The value of 1 for this ratio indicates homoskedasticity and values greater than 1 correspond to the presence of heteroskedasticity. Four different sets of $\lambda' = (\lambda_0, \lambda_1, \lambda_2)$ and $\theta' = (\theta_0, \theta_1, \theta_2)$ were chosen. For the first set we took $\lambda' = (20.0, 0.0, 0.0)$ and $\theta' = (20.0, 0.0, 0.0)$, corresponding to the homoskedastic case where the standard OLS variance estimator is appropriate. This case was considered to check the cost of using the alternative variance-covariance estimators when in fact there is no heteroskedasticity. In the second, third and fourth sets of experiments we took the values of $(20.0, 0.01, 3.5)$, $(20.0, 0.01, 7.0)$ and $(20.0, 0.01, 10.5)$ for $\lambda$, and $(20.0, 0.01, 10.0)$, $(20.0, 0.01, 20.0)$ and $(20.0, 0.01, 30.0)$ for $\theta$, producing a relatively moderate to a very severe degree of heteroskedasticity.
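The experimental design can be sketched as follows in Python with NumPy. This is only an illustration: it uses NumPy's random generator rather than the IMSL routines named in the text, and the function names and the seed are assumptions.

```python
import numpy as np

def make_design(n, rng):
    """Intercept, log-normal x1 = exp(z) with z ~ N(0, 1), and x2 ~ N(2, 1)."""
    x1 = np.exp(rng.standard_normal(n))
    x2 = 2.0 + rng.standard_normal(n)
    return np.column_stack([np.ones(n), x1, x2])

def model1_variances(X, lam):
    """Model 1: sigma_i^2 = lam0 + lam1 * x1i + lam2 * x2i."""
    return lam[0] + lam[1] * X[:, 1] + lam[2] * X[:, 2]

def model2_variances(u, theta):
    """Model 2: sigma_i^2 = theta0 + theta1 * u_i + theta2 * u_i^2."""
    return theta[0] + theta[1] * u + theta[2] * u ** 2

def degree(sigma2):
    """Degree of heteroskedasticity: max_i sigma_i^2 / min_i sigma_i^2."""
    return sigma2.max() / sigma2.min()

rng = np.random.default_rng(0)                 # arbitrary seed
X = make_design(50, rng)
u = rng.standard_normal(50)
ratios = [degree(model2_variances(u, (20.0, 0.01, t2))) for t2 in (0.0, 10.0, 20.0, 30.0)]
```

Note that Model 1 is linear in a normally distributed regressor, so a sketch like this should check that all generated variances are positive before drawing disturbances.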

One way to study the finite sample performance of an estimator is to use the MSE, which in some sense encompasses both the bias and the variance of the estimator. We carried out each experiment with eight different sample sizes $n = 30, 40, 50, \ldots, 100$. In every experiment, for each sample size, we calculated the MSE of each estimator from the exact expression of variances we derived in Section 4. Note that these computations do not require any replication and, therefore, are not subject to sampling error.

As we mentioned earlier, the MINQUE procedure may produce negative estimates of $\sigma_i^2$ and even negative estimates of the variance of $\hat\beta$. Given this situation we also considered a truncated MINQUE with each negative estimate of $\sigma_i^2$ being replaced by $\hat\varepsilon_i^2/(1 - h_{ii})$, where $\hat\varepsilon_i$ is the OLS residual and $h_{ii}$ is the diagonal element of the hat matrix $X(X'X)^{-1}X'$. Since the algebraic expression of this estimator is difficult to obtain, we estimated the bias, the variance and the MSE of the estimator from the sampling experiment. For each experiment and sample size, let $v_k$ denote the true variance of the OLS estimator $\hat\beta_k$ and $\hat v_{kr}$ its estimate in the $r$-th replication. The bias of the variance estimator of $\hat\beta_k$ was estimated by the average bias; we denote this as

$$ \widehat{\mathrm{bias}} = \sum_{r=1}^{1000} \frac{(\hat v_{kr} - v_k)}{1000}. $$

The variance and the MSE were estimated by

$$ \widehat{\mathrm{VAR}} = \sum_{r=1}^{1000} \frac{(\hat v_{kr} - \bar v_k)^2}{1000} $$

and

$$ \widehat{\mathrm{MSE}} = \sum_{r=1}^{1000} \frac{(\hat v_{kr} - v_k)^2}{1000}, $$

respectively, where $\bar v_k$ is the sample average of all $\hat v_{kr}$. For comparison, we also carried out the computations for all other procedures.
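The three sampling summaries can be written as a small helper. The sketch below is in Python with NumPy; the function name and the toy numbers are assumptions for illustration, and the divisor is the number of replications supplied (1000 in the experiment).

```python
import numpy as np

def summarize(estimates, true_value):
    """Average bias, variance and MSE of replicated variance estimates."""
    est = np.asarray(estimates, dtype=float)
    bias = np.mean(est - true_value)
    var = np.mean((est - est.mean()) ** 2)    # deviations from the sample average
    mse = np.mean((est - true_value) ** 2)    # deviations from the true variance
    return bias, var, mse

bias, var, mse = summarize([1.0, 2.0, 3.0, 6.0], true_value=2.0)
```

With these definitions the estimated MSE equals the estimated variance plus the squared estimated bias.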

Since the variance-covariance matrix estimators are usually used for statistical inference, it is important to study the performance of each estimator in terms of statistical inference. In our experiment we used the estimators to estimate the confidence interval coverage probabilities for the regression parameters. We assumed the distribution of $(\hat\beta_k - \beta_k)/\sqrt{\widehat{\mathrm{var}}(\hat\beta_k)}$ is Student's $t$, where $\hat\beta_k$ $(k = 1, 2)$ is the OLS estimator of $\beta_k$ under the complete model and $\widehat{\mathrm{var}}(\hat\beta_k)$ is the variance of $\hat\beta_k$ in a given procedure. For each procedure we estimated $\Pr\big(\hat\beta_k - t_{\alpha/2}\sqrt{\widehat{\mathrm{var}}(\hat\beta_k)} < \beta_k < \hat\beta_k + t_{\alpha/2}\sqrt{\widehat{\mathrm{var}}(\hat\beta_k)}\big)$, where $t_{\alpha/2}$ is the $\alpha/2$-th quantile of Student's $t$ distribution. We took the nominal size $\alpha = 5\%$. The estimate of this probability is $\hat p = F/R$, where $F$ is the observed frequency of $\beta_k$ being in its 95% confidence interval and $R$ is the number of replications. Since we have 1000 replications, an estimate of the standard error of $\hat p$ is $\sqrt{\hat p(1 - \hat p)/1000}$. We also carried out this computation for a procedure which utilizes the true variance of $\hat\beta$. In this particular case, $t_{\alpha/2}$ is the $\alpha/2$-th quantile of the standard normal distribution.
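A coverage experiment of this kind can be sketched as follows in Python with NumPy. The sketch compares the procedure based on the true variance with the one based on White's estimator; as a simplification it uses the standard normal critical value for both, whereas the text uses Student's $t$ for the estimated-variance procedures. The function name, arguments and seed are assumptions.

```python
import numpy as np
from statistics import NormalDist

def coverage(X, beta, sigma2, k, reps=1000, alpha=0.05, seed=0):
    """Estimated coverage of the interval for beta[k]: true variance vs. White."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    C = np.linalg.solve(X.T @ X, X.T)              # (X'X)^{-1} X'
    M = np.eye(n) - X @ C
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)    # normal critical value (simplification)
    true_se = np.sqrt(np.sum(C[k] ** 2 * sigma2))  # square root of Omega_T(k, k)
    hits_true = hits_white = 0
    for _ in range(reps):
        y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)
        b = C[k] @ y                               # OLS estimate of beta[k]
        e2 = (M @ y) ** 2                          # squared OLS residuals
        white_se = np.sqrt(np.sum(C[k] ** 2 * e2))
        hits_true += abs(b - beta[k]) < z * true_se
        hits_white += abs(b - beta[k]) < z * white_se
    p_true, p_white = hits_true / reps, hits_white / reps
    se = lambda p: np.sqrt(p * (1.0 - p) / reps)   # standard error of p-hat
    return (p_true, se(p_true)), (p_white, se(p_white))
```

The same loop could be extended with other weighting schemes for the squared residuals to compare further procedures.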

A summary of the results of our experiments on the bias, the variance and the mean square errors is presented in Tables I through III. Table I contains some information for the homoskedastic case, while Tables II and III present the results for the case of heteroskedasticity of Model 1 for sample sizes 60 and 100, respectively. The remaining results are not reported as they are generally similar to those presented in the tables. Also, the results for heteroskedasticity of Model 2 have a quite similar pattern to those of Model 1. The information contained in these tables is generally self-explanatory. Here we mention only some interesting points.

In Tables I through III, the numbers in parentheses indicate that the results were estimated from the sampling experiment; these are evidently reasonably close to those calculated from their exact expressions. The results for the homoskedastic case in Table I show, as one should expect, that OLS performed quite well in terms of bias, variance and MSE. As we observe the results in the other tables, in each experiment and for a given sample size, the OLS variance estimator generally has the smallest variance, except for a few cases when the heteroskedasticity is very strong. Among the robust procedures, the pattern of their variances is more systematic. As expected, the variance of White's estimator is always smaller than that of MacKinnon and White's (MWE). It is also evident that MINQUE consistently has much smaller bias but larger variance than the other three variance estimators.

The MSE results for the heteroskedastic cases are quite surprising. In Tables II and III
we present results only for sample sizes 60 and 100; the other results are very similar. In
Table II, except for the coefficient $\beta_0$, the OLS estimator generally has the smallest
MSE, even in the presence of strong heteroskedasticity. Such surprising results are more
prominent for sample size 100, in Table III, where the MSEs associated with OLS are the
smallest for all the $\beta$'s. Based on these results, one might be tempted to suggest
that OLS is the estimator of choice even in the presence of heteroskedasticity. Such a
conclusion, however, requires caution, simply because MSE may not be an appropriate
criterion for characterizing an estimator. As we observe from those tables, the MSEs of
the robust procedures are very much dominated by the variance part. Consequently,
an MSE comparison is essentially a variance comparison, and whether this is an
appropriate approach remains questionable. In the subsequent discussion we make
bias comparisons across different design matrices (represented by different sample sizes)
and degrees of heteroskedasticity. We also study the behavior of the $t$-statistic associated
with each procedure through simulated confidence interval coverage probabilities.
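
The statement that the MSE of the robust procedures is dominated by the variance can be checked from the decomposition MSE = variance + bias squared. The following Python sketch is illustrative only: it applies the decomposition to the exact (non-simulated) entries for $\beta_0$ in Table II at ratio 3.25. The function name `mse_times_100` is hypothetical, and the scaling follows the table headings (bias × 10, variance × 100).

```python
def mse_times_100(bias_x10, var_x100):
    """Return MSE x 100 from a tabulated bias (x 10) and variance (x 100)."""
    bias = bias_x10 / 10.0
    return var_x100 + 100.0 * bias ** 2


# Exact entries for beta_0 in Table II (sample size 60, ratio 3.25):
# procedure -> (bias x 10, variance x 100, reported MSE x 100)
table_ii_beta0 = {
    "OLS": (7.33, 62.75, 116.45),
    "White": (-1.97, 74.88, 78.77),
    "MWE": (0.17, 89.62, 89.65),
    "MINQUE": (0.00, 97.39, 97.39),
}

for name, (bias_x10, var_x100, reported) in table_ii_beta0.items():
    mse = mse_times_100(bias_x10, var_x100)
    share = var_x100 / mse  # fraction of the MSE that is variance
    print(f"{name:7s} MSE x 100 = {mse:7.2f} (table: {reported:7.2f}), "
          f"variance share = {share:.2f}")
```

For the three robust procedures more than nine-tenths of the MSE is variance, whereas for OLS the squared bias contributes almost half of it.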

Since the true variance of $\hat{\beta}$ differs across design matrices and degrees of
heteroskedasticity (see Table VI), a meaningful comparison requires some adjustment that
eliminates the effect of these differences. In Tables IV and V we present the relative bias,
which we define as the ratio of the bias to the true variance of $\hat{\beta}$. Sample sizes
60 and 100 (Table V) represent design matrices with a small maximum value of $h_{tt}$, which
is regarded as a measure of leverage, while sample sizes 40 and 90 (Table IV) correspond to
those with a large maximum value of $h_{tt}$. The following figures describe the nature of
all eight design matrices in terms of $h_{tt}$:


Sample Size                  30       40       50       60       70       80       90       100
$\max(h_{tt})$               0.609    0.666    0.500    0.266    0.377    0.417    0.749    0.274
$\min(h_{tt})$               0.035    0.025    0.021    0.017    0.015    0.013    0.012    0.011
$\max(h_{tt})/\min(h_{tt})$  17.604   26.320   24.380   15.952   26.007   32.795   64.603   26.095

Here we should note that, in order to study the effects of different degrees of leverage,
we did not attempt to construct design matrices for the various sample sizes with the same
measure of leverage.
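
The leverage measures reported for the design matrices can be computed directly from the regressors. The sketch below is a minimal Python illustration under a simplifying assumption: the model has an intercept and a single regressor, so that $h_{tt} = 1/n + (x_t - \bar{x})^2 / \sum_s (x_s - \bar{x})^2$ in closed form. The paper's design has three coefficients; the function names and the data here are hypothetical.

```python
def leverages(x):
    """Diagonal elements h_tt of X(X'X)^{-1}X' for X = [1, x]."""
    n = len(x)
    mean = sum(x) / n
    sxx = sum((v - mean) ** 2 for v in x)
    return [1.0 / n + (v - mean) ** 2 / sxx for v in x]


def leverage_summary(x):
    """Return (max h_tt, min h_tt, max/min) for the design X = [1, x]."""
    h = leverages(x)
    return max(h), min(h), max(h) / min(h)


# Hypothetical regressors: a balanced grid, and the same grid with one outlier.
balanced = [float(t) for t in range(1, 31)]
unbalanced = balanced[:-1] + [120.0]

for label, x in (("balanced", balanced), ("unbalanced", unbalanced)):
    h_max, h_min, ratio = leverage_summary(x)
    print(f"{label:10s} max={h_max:.3f} min={h_min:.3f} ratio={ratio:.3f}")
```

The leverages always sum to the number of regression coefficients (two here), so a single outlying observation raises $\max(h_{tt})$ at the expense of the remaining points.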

A general inspection of Tables IV and V reveals the expected result that OLS performs
well when the disturbances are homoskedastic, but its performance deteriorates steadily as
the degree of heteroskedasticity increases. White's estimator behaves in a distinctive way.
It clearly exhibits a large bias when the disturbances are in fact homoskedastic.
The downward bias is of course guaranteed in this case, since the degree of heteroskedasticity
is less than 2. It is quite evident, however, that White's estimator tends to underestimate
the true variance even in the presence of strong heteroskedasticity. When the disturbances
are indeed heteroskedastic, the biases of White's estimator are smaller than those of OLS
if the design matrix is relatively balanced, but this is not true if the design matrix
is unbalanced. Clearly, White's estimator is very sensitive to the presence of high-leverage
points. In such a situation the performance of White's estimator is no better than that
of OLS, and it is even worse for inferences regarding the coefficient $\beta_1$.
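
As a worked illustration of the relative bias measure, the sketch below recomputes it for OLS from tabulated quantities. It is a hedged example: the biases are the exact OLS entries of Table II at ratio 3.25 (rescaled from "bias × 10"), the true variances are the sample-size-60 entries of Table VI under column H(1,1), and the pairing of that column with ratio 3.25 is an assumption made here. The function name `relative_bias_percent` is hypothetical.

```python
def relative_bias_percent(bias, true_variance):
    """Bias of a variance estimator as a percentage of the true variance."""
    return 100.0 * bias / true_variance


# OLS biases from Table II (ratio 3.25), rescaled from "bias x 10".
ols_bias = {"beta_0": 0.733, "beta_1": 0.055, "beta_2": -0.072}
# True variances from Table VI, sample size 60, column H(1,1).
true_var = {"beta_0": 3.2946, "beta_1": 0.3418, "beta_2": 0.8553}

for coef in ols_bias:
    rb = relative_bias_percent(ols_bias[coef], true_var[coef])
    print(f"{coef}: relative bias of OLS = {rb:.3f}%")
```

The three values agree, up to rounding, with the OLS entries of Table V for sample size 60 at ratio 3.25 (22.249, 16.091 and -8.418).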

MacKinnon and White's estimator performs quite well in terms of bias. Even
though there is evidence that it is also sensitive to unbalancedness of the design
matrix, the effect is not as severe as for White's estimator. In the extreme case where
both $\max(h_{tt})$ and $\max(h_{tt})/\min(h_{tt})$ are high (sample size 90), the relative
biases are still quite severe, but they are small compared with those of the OLS and White's
estimators. As in the case of White's estimator, the most severe effect of leverage
is on the variance of $\hat{\beta}_1$, which is associated with the regressor contributing
most to the presence of high-leverage points.
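
To make the comparison of the OLS, White and MWE procedures concrete, the following Python sketch computes the three variance estimates for the slope of a simple regression. It rests on stated assumptions: a model with an intercept and one regressor (the paper uses three coefficients), White's estimator built from the raw squared residuals, and MWE taken to be the variant that rescales each squared residual by $1/(1 - h_{tt})$, as suggested by the truncation rule used for MINQUE1. All names and data are hypothetical.

```python
def fit_simple_ols(x, y):
    """OLS fit of y on [1, x]; returns slope, residuals, deviations, Sxx, leverages."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    d = [v - x_bar for v in x]
    sxx = sum(v * v for v in d)
    b1 = sum(dv * (yv - y_bar) for dv, yv in zip(d, y)) / sxx
    b0 = y_bar - b1 * x_bar
    resid = [yv - b0 - b1 * xv for xv, yv in zip(x, y)]
    h = [1.0 / n + dv * dv / sxx for dv in d]
    return b1, resid, d, sxx, h


def slope_variance_estimates(x, y):
    """Estimated variance of the slope under the three procedures."""
    _, resid, d, sxx, h = fit_simple_ols(x, y)
    n = len(x)
    s2 = sum(e * e for e in resid) / (n - 2)
    ols = s2 / sxx
    white = sum(dv * dv * e * e for dv, e in zip(d, resid)) / sxx ** 2
    mwe = sum(dv * dv * e * e / (1.0 - ht)
              for dv, e, ht in zip(d, resid, h)) / sxx ** 2
    return {"OLS": ols, "White": white, "MWE": mwe}


# Hypothetical data with one high-leverage observation.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 12.0]
y = [1.2, 1.9, 3.4, 3.8, 5.3, 5.9, 7.4, 11.0]
est = slope_variance_estimates(x, y)
for name, value in est.items():
    print(f"{name:5s} {value:.5f}")
```

Because every factor $1/(1 - h_{tt})$ exceeds one, the MWE estimate is never below White's, which is consistent with the smaller downward bias reported for MWE.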

In the sampling experiments, MINQUE is of course the only procedure that
consistently produced very small bias, because theoretically its bias is zero irrespective
of the nature of the design matrix and the degree of heteroskedasticity. However, to
achieve unbiasedness, MINQUE seems to bear the cost of producing large variance and
negative estimates when the design matrix is unbalanced. Our results also indicate that
a large maximum value of $h_{tt}$ affects the variance of MINQUE corresponding to
$\beta_1$. In Table I, for example, the variance of MINQUE for $\beta_1$ explodes at
sample size 30. The same is also true for sample sizes 40 and 90. Even though the variance
of MWE also increases, the effect does not seem as severe as in the MINQUE case. The ad hoc
truncated MINQUE, denoted by MINQUE1 in the tables, which is obtained by replacing any
negative estimate of $\sigma_t^2$ by $\hat{\epsilon}_t^2/(1 - h_{tt})$, exhibits quite
large bias and variance.
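
The trade-off described above can be seen in a small numerical sketch. It is written in Python under explicit assumptions: the MINQUE of the individual variances is taken in the familiar form $\hat{\sigma}^2 = (M \circ M)^{-1} \hat{\epsilon}^{(2)}$, where $M = I - X(X'X)^{-1}X'$, $\circ$ is the elementwise product and $\hat{\epsilon}^{(2)}$ is the vector of squared OLS residuals. This form is consistent with $E(\hat{\epsilon}_i^2) = \sum_t m_{it}^2 \sigma_t^2$ from the Appendix, but it is not quoted from this section. The design (intercept plus one regressor), the data and all function names are hypothetical.

```python
def residual_maker(x):
    """M = I - X(X'X)^{-1}X' for X = [1, x] (intercept plus one regressor)."""
    n = len(x)
    x_bar = sum(x) / n
    d = [v - x_bar for v in x]
    sxx = sum(v * v for v in d)
    return [[(1.0 if t == s else 0.0) - (1.0 / n + d[t] * d[s] / sxx)
             for s in range(n)] for t in range(n)]


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * z[c] for c in range(i + 1, n))
        z[i] = (aug[i][n] - tail) / aug[i][i]
    return z


def minque(x, resid):
    """Assumed MINQUE form: solve (M o M) sigma2 = squared residuals."""
    m = residual_maker(x)
    m_sq = [[v * v for v in row] for row in m]
    return solve(m_sq, [e * e for e in resid])


def truncated_minque(x, resid):
    """MINQUE1: replace negative estimates by e_t^2 / (1 - h_tt) = e_t^2 / m_tt."""
    m = residual_maker(x)
    raw = minque(x, resid)
    return [s if s >= 0.0 else e * e / m[t][t]
            for t, (s, e) in enumerate(zip(raw, resid))]


# Hypothetical design with one high-leverage point, and hypothetical disturbances.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 12.0]
eps = [0.3, -1.1, 0.8, -0.2, 0.5, -0.9, 1.4, 0.6]
m = residual_maker(x)
resid = [sum(m[t][s] * eps[s] for s in range(len(x))) for t in range(len(x))]
print("MINQUE  :", [round(v, 3) for v in minque(x, resid)])
print("MINQUE1 :", [round(v, 3) for v in truncated_minque(x, resid)])
```

Unbiasedness corresponds to the fact that feeding the expected squared residuals $(M \circ M)\sigma^2$ into the solver returns $\sigma^2$ exactly; individual estimates, however, need not be positive, which is what the truncation addresses.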

Next, we study the performance of the different estimators in terms of coverage
probabilities. In Figures I through IV we present the estimates of the 95% confidence interval
coverage probability for $\beta_2$ and $\beta_1$ for different sample sizes. They illustrate
how the use of different variance estimators for each regression coefficient alters the
coverage probabilities in the absence and presence of heteroskedasticity. When the
disturbances are homoskedastic (Figure I), the robust variance estimators cover $\beta_2$
quite well. Even though they are not as good as OLS, the cost of using them does not seem
too high, except for White's estimator, whose coverage is the farthest from 95%. When the
disturbances are indeed heteroskedastic (Figure II), all the robust procedures perform much
better than OLS, which now covers $\beta_2$ far below 95% of the time. Interestingly, even
though the truncated MINQUE has large MSE, it performs reasonably well in terms of coverage
probabilities. The performance of MWE also appears to be good in this case.

The behavior of the variance estimators for $\beta_1$, in Figures III and IV, is quite
different. In both the homoskedastic and heteroskedastic cases, all the robust estimators
perform very poorly. In the heteroskedastic case, even though there is a slight improvement
in the performance of the robust estimators and some deterioration in the performance of OLS,
the robust estimators are still worse than OLS. This illustration shows once again
that the use of the robust procedures, especially White's, can lead to serious inferential
problems when the design matrix exhibits high-leverage points.
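
Coverage probabilities of the kind discussed above can be estimated with a simple Monte Carlo loop. The Python sketch below is illustrative and not the design used in the paper: it assumes an intercept-plus-slope model on a balanced grid, disturbance standard deviations proportional to $|x_t - \bar{x}|$, a normal critical value of 1.96, and the $1/(1 - h_{tt})$ form of MWE. All names, the seed and the number of replications are hypothetical choices.

```python
import math
import random


def slope_and_ses(x, y):
    """Slope estimate with OLS and MWE-type standard errors for X = [1, x]."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    d = [v - x_bar for v in x]
    sxx = sum(v * v for v in d)
    b1 = sum(dv * (yv - y_bar) for dv, yv in zip(d, y)) / sxx
    b0 = y_bar - b1 * x_bar
    e2 = [(yv - b0 - b1 * xv) ** 2 for xv, yv in zip(x, y)]
    h = [1.0 / n + dv * dv / sxx for dv in d]
    var_ols = sum(e2) / (n - 2) / sxx
    var_mwe = sum(dv * dv * e / (1.0 - ht)
                  for dv, e, ht in zip(d, e2, h)) / sxx ** 2
    return b1, math.sqrt(var_ols), math.sqrt(var_mwe)


def coverage(x, sigma, beta1=1.0, reps=2000, crit=1.96, seed=0):
    """Share of replications in which each interval covers the true slope."""
    rng = random.Random(seed)
    hits = {"OLS": 0, "MWE": 0}
    for _ in range(reps):
        y = [beta1 * xv + rng.gauss(0.0, s) for xv, s in zip(x, sigma)]
        b1, se_ols, se_mwe = slope_and_ses(x, y)
        hits["OLS"] += int(abs(b1 - beta1) <= crit * se_ols)
        hits["MWE"] += int(abs(b1 - beta1) <= crit * se_mwe)
    return {name: count / reps for name, count in hits.items()}


# Hypothetical balanced design with variance increasing away from the mean.
x = [float(t) for t in range(1, 41)]
sigma = [abs(v - 20.5) for v in x]
result = coverage(x, sigma, reps=2000, seed=1)
print(result)
```

In this balanced, strongly heteroskedastic setting the OLS interval undercovers while the MWE interval stays closer to the nominal level, the pattern described for $\beta_2$; reproducing the poor behavior reported for $\beta_1$ would require a design with high-leverage points.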

6.  Conclusion 


In this paper we have reconsidered MINQUE as an alternative procedure for estimating
the variance-covariance matrix in a general heteroskedastic model. We showed that
the problem can be approached very easily within the framework of variance components
models, of which the heteroskedastic model is a special case. By construction, MINQUE
incorporates all the information on the design matrix in order to eliminate the bias, which
is known to be the leading cause of the inferential problems of White's estimator. Our Monte
Carlo study, however, indicated that high-leverage points in the design matrix can
lead to negative MINQUE estimates and dramatically increase its variance. We also
considered an ad hoc truncation of MINQUE, replacing the negative estimates by some
small positive values. In terms of coverage probabilities, this truncated MINQUE performs
reasonably well compared with the other robust procedures. It would, however, be desirable
to truncate those negative estimates in a more systematic way. Overall, based on the
two previous simulation studies of MacKinnon and White (1985) and Nanayakkara and
Cressie (1991) and on our results on bias, variance and coverage probabilities, the simple
almost unbiased estimator suggested by MacKinnon and White seems preferable for
practical purposes.

MINQUE is developed within the framework of the variance components model. This
framework is very rich and encompasses many econometric models. One interesting
extension of our approach is the estimation of autoregressive conditional heteroskedasticity
(ARCH) models. Suprayitno (1992) applied MINQUE to various ARCH models
to provide an alternative to the maximum likelihood estimation procedure. This
approach might be promising since MINQUE does not require an explicit distributional
assumption. Also, in this case MINQUE might have better finite sample properties, since
here we need to estimate only a few ARCH coefficients.

Table I
Biases, Variances and Mean Square Errors for Homoskedastic Case

Sample                                    Estimation Procedure*
Size  Coef.       OLS               White             MWE               MINQUE            MINQUE1

Bias × 10
30    $\beta_0$   0.00 (0.43)       -5.97 (-5.87)     0.00 (0.05)       0.00 (-0.06)      (3.26)
      $\beta_1$   0.00 (0.05)       -1.95 (-2.00)     0.00 (-0.15)      0.00 (-0.45)      (1.90)
      $\beta_2$   0.00 (0.10)       -1.40 (-1.34)     0.00 (0.08)       0.00 (0.10)       (0.66)
Variance × 100
30    $\beta_0$   118.93 (115.75)   213.35 (214.01)   301.50 (300.33)   347.81 (341.01)   (366.86)
      $\beta_1$   1.83 (1.78)       3.55 (3.04)       18.33 (15.38)     107.00 (90.52)    (70.23)
      $\beta_2$   6.77 (6.59)       13.36 (13.89)     19.34 (20.09)     23.17 (24.08)     (24.70)
Mean Square Error × 100
30    $\beta_0$   118.93 (115.94)   248.96 (248.50)   301.50 (300.33)   347.81 (341.01)   (377.50)
      $\beta_1$   1.83 (1.78)       7.36 (7.06)       18.33 (15.40)     107.00 (90.72)    (73.87)
      $\beta_2$   6.77 (6.60)       15.32 (15.67)     19.34 (20.10)     23.17 (24.09)     (25.15)

Bias × 10
60    $\beta_0$   0.00 (0.02)       -1.54 (-1.53)     0.00 (0.01)       0.00 (0.00)       (0.39)
      $\beta_1$   0.00 (0.00)       -0.29 (-0.26)     0.00 (0.04)       0.00 (0.04)       (0.15)
      $\beta_2$   0.00 (0.00)       -0.34 (-0.34)     0.00 (0.01)       0.00 (0.02)       (0.08)
Variance × 100
60    $\beta_0$   20.20 (18.86)     47.84 (46.54)     57.58 (55.51)     63.74 (60.74)     (60.96)
      $\beta_1$   0.20 (0.18)       0.72 (0.74)       1.07 (1.10)       1.34 (1.35)       (1.40)
      $\beta_2$   0.76 (0.71)       1.55 (1.53)       1.88 (1.86)       2.09 (2.05)       (2.06)
Mean Square Error × 100
60    $\beta_0$   20.20 (18.86)     50.22 (48.88)     57.58 (55.51)     63.74 (60.74)     (61.12)
      $\beta_1$   0.20 (0.18)       0.80 (0.81)       1.07 (1.10)       1.34 (1.35)       (1.43)
      $\beta_2$   0.76 (0.71)       1.66 (1.66)       1.88 (1.86)       2.09 (2.05)       (2.06)

Bias × 10
100   $\beta_0$   0.00 (-0.02)      -0.62 (-0.65)     0.00 (-0.04)      0.00 (-0.05)      (0.13)
      $\beta_1$   0.00 (0.00)       -0.09 (-0.10)     0.00 (-0.02)      0.00 (-0.02)      (0.01)
      $\beta_2$   0.00 (0.00)       -0.11 (-0.11)     0.00 (0.00)       0.00 (0.00)       (0.02)
Variance × 100
100   $\beta_0$   2.96 (2.98)       6.63 (6.35)       7.78 (7.36)       8.95 (8.36)       (8.26)
      $\beta_1$   0.01 (0.01)       0.08 (0.07)       0.11 (0.11)       0.17 (0.15)       (0.15)
      $\beta_2$   0.10 (0.10)       0.26 (0.26)       0.30 (0.29)       0.31 (0.31)       (0.31)
Mean Square Error × 100
100   $\beta_0$   2.96 (2.98)       7.01 (6.78)       7.78 (7.36)       8.95 (8.36)       (8.26)
      $\beta_1$   0.01 (0.01)       0.08 (0.08)       0.11 (0.11)       0.17 (0.15)       (0.15)
      $\beta_2$   0.10 (0.10)       0.28 (0.27)       0.30 (0.29)       0.31 (0.31)       (0.31)

* Numbers in parentheses were calculated from the sampling experiment.

Table II
Biases, Variances and Mean Square Errors for Heteroskedasticity of Model 1
(Sample Size = 60, $\max(h_{tt})$ = .266, $\min(h_{tt})$ = .017)

Ratio = $\max(\sigma_t^2)/\min(\sigma_t^2)$

                                          Estimation Procedure*
Ratio Coef.       OLS               White             MWE               MINQUE            MINQUE1

Bias × 10
3.25  $\beta_0$   7.33 (7.42)       -1.97 (-1.85)     0.17 (0.30)       0.00 (0.13)       (0.74)
      $\beta_1$   0.55 (0.56)       -0.32 (-0.25)     0.07 (0.16)       0.00 (-0.09)      (0.28)
      $\beta_2$   -0.72 (-0.71)     -0.61 (-0.60)     -0.01 (0.00)      0.00 (0.01)       (0.17)
Variance × 100
3.25  $\beta_0$   62.75 (58.43)     74.88 (75.78)     89.62 (90.39)     97.39 (97.56)     (98.09)
      $\beta_1$   0.61 (0.57)       1.44 (1.63)       2.04 (2.30)       2.39 (2.65)       (2.81)
      $\beta_2$   2.37 (2.21)       6.48 (6.71)       7.70 (8.00)       8.33 (8.67)       (8.78)
Mean Square Error × 100
3.25  $\beta_0$   116.45 (113.44)   78.77 (79.19)     89.65 (90.48)     97.39 (97.57)     (98.64)
      $\beta_1$   0.91 (0.88)       1.55 (1.70)       2.05 (2.32)       2.39 (2.66)       (2.89)
      $\beta_2$   2.90 (2.71)       6.85 (7.07)       7.70 (8.00)       8.33 (8.67)       (8.81)

Bias × 10
5.42  $\beta_0$   14.66 (14.80)     -2.40 (-2.16)     0.34 (0.59)       0.00 (0.25)       (1.10)
      $\beta_1$   1.11 (1.12)       -0.36 (-0.25)     0.15 (0.27)       0.00 (0.13)       (0.41)
      $\beta_2$   -1.45 (-1.42)     -0.88 (-0.85)     -0.02 (-0.01)     0.00 (0.03)       (0.26)
Variance × 100
5.42  $\beta_0$   135.49 (126.59)   118.27 (123.24)   141.11 (147.10)   151.14 (157.17)   (159.25)
      $\beta_1$   1.32 (1.22)       2.51 (2.96)       3.46 (4.07)       3.89 (4.52)       (4.84)
      $\beta_2$   5.12 (4.78)       16.02 (16.65)     18.98 (19.79)     20.44 (21.38)     (21.75)
Mean Square Error × 100
5.42  $\beta_0$   350.38 (345.80)   124.03 (127.92)   141.22 (147.45)   151.14 (157.23)   (160.46)
      $\beta_1$   2.54 (2.49)       2.64 (3.02)       3.49 (4.15)       3.89 (4.54)       (5.01)
      $\beta_2$   7.21 (6.80)       16.79 (17.37)     18.98 (19.79)     20.44 (21.38)     (21.82)

Bias × 10
7.50  $\beta_0$   21.99 (22.19)     -2.83 (-2.48)     0.50 (0.88)       0.00 (0.38)       (1.47)
      $\beta_1$   1.66 (1.68)       -0.39 (-0.24)     0.22 (0.39)       0.00 (0.18)       (0.54)
      $\beta_2$   -2.17 (-2.13)     -1.15 (-1.10)     -0.03 (-0.02)     0.00 (0.05)       (0.36)
Variance × 100
7.50  $\beta_0$   238.45 (223.30)   178.02 (188.83)   212.07 (225.53)   225.05 (239.43)   (243.52)
      $\beta_1$   2.32 (2.17)       3.93 (4.73)       5.33 (6.42)       5.84 (6.98)       (7.51)
      $\beta_2$   9.01 (8.44)       30.19 (31.33)     35.72 (37.20)     38.43 (40.16)     (40.91)
Mean Square Error × 100
7.50  $\beta_0$   722.04 (715.78)   186.02 (194.98)   212.33 (226.32)   225.05 (239.58)   (245.69)
      $\beta_1$   5.08 (5.00)       4.08 (4.79)       5.38 (6.57)       5.84 (7.01)       (7.81)
      $\beta_2$   13.72 (12.98)     31.50 (32.54)     35.72 (37.20)     38.43 (40.17)     (41.04)

* Numbers in parentheses were calculated from the sampling experiment.

Table III
Biases, Variances and Mean Square Errors for Heteroskedasticity of Model 1
(Sample Size = 100, $\max(h_{tt})$ = .274, $\min(h_{tt})$ = .011)

Ratio = $\max(\sigma_t^2)/\min(\sigma_t^2)$

                                          Estimation Procedure*
Ratio Coef.       OLS               White             MWE               MINQUE            MINQUE1

Bias × 10
4.42  $\beta_0$   1.25 (1.25)       -1.69 (-1.72)     -0.30 (-0.36)     0.00 (-0.08)      (0.23)
      $\beta_1$   -0.23 (-0.23)     -0.28 (-0.31)     -0.06 (0.10)      0.00 (-0.05)      (0.01)
      $\beta_2$   -1.04 (-1.04)     -0.33 (-0.32)     -0.05 (-0.04)     0.00 (0.01)       (0.06)
Variance × 100
4.42  $\beta_0$   11.71 (11.92)     24.61 (22.67)     32.19 (28.89)     41.56 (36.25)     (36.00)
      $\beta_1$   0.05 (0.05)       0.45 (0.38)       0.78 (0.65)       1.26 (1.04)       (1.03)
      $\beta_2$   0.41 (0.41)       2.43 (2.35)       2.80 (2.71)       3.03 (2.92)       (2.95)
Mean Square Error × 100
4.42  $\beta_0$   13.28 (13.48)     27.46 (25.64)     32.28 (29.02)     41.56 (36.25)     (36.06)
      $\beta_1$   0.11 (0.11)       0.53 (0.48)       0.78 (0.66)       1.26 (1.04)       (1.03)
      $\beta_2$   1.49 (1.50)       2.53 (2.45)       2.80 (2.71)       3.03 (2.92)       (2.95)

Bias × 10
7.71  $\beta_0$   2.49 (2.52)       -2.75 (-2.79)     -0.60 (-0.67)     0.00 (-0.11)      (0.35)
      $\beta_1$   -0.46 (-0.46)     -0.47 (-0.52)     -0.13 (0.19)      0.00 (-0.08)      (0.01)
      $\beta_2$   -2.08 (-2.07)     -0.54 (-0.52)     -0.09 (-0.08)     0.00 (0.02)       (0.09)
Variance × 100
7.71  $\beta_0$   27.80 (28.32)     61.47 (55.75)     82.09 (72.49)     107.84 (92.42)    (92.19)
      $\beta_1$   0.13 (0.13)       1.18 (0.97)       2.06 (1.68)       3.39 (2.75)       (2.73)
      $\beta_2$   0.97 (0.99)       7.11 (6.87)       8.23 (7.93)       8.93 (8.58)       (8.67)
Mean Square Error × 100
7.71  $\beta_0$   34.02 (34.66)     69.05 (63.55)     82.45 (72.95)     107.84 (92.43)    (92.31)
      $\beta_1$   0.34 (0.34)       1.40 (1.24)       2.08 (1.72)       3.39 (2.76)       (2.73)
      $\beta_2$   5.29 (5.29)       7.41 (7.14)       8.24 (7.93)       8.93 (8.58)       (8.68)

Bias × 10
10.87 $\beta_0$   3.74 (3.79)       -3.82 (-3.86)     -0.90 (-0.99)     0.00 (-0.14)      (0.47)
      $\beta_1$   -0.70 (-0.69)     -0.65 (-0.73)     -0.19 (-0.27)     0.00 (-0.10)      (0.01)
      $\beta_2$   -3.12 (-3.11)     -0.76 (-0.73)     -0.14 (-0.11)     0.00 (0.02)       (0.13)
Variance × 100
10.87 $\beta_0$   51.25 (52.19)     117.21 (105.62)   157.49 (138.18)   207.83 (176.90)   (177.11)
      $\beta_1$   0.23 (0.24)       2.25 (1.84)       3.96 (3.21)       6.56 (5.29)       (5.25)
      $\beta_2$   1.78 (1.82)       14.32 (13.81)     16.59 (15.95)     18.01 (17.28)     (17.49)
Mean Square Error × 100
10.87 $\beta_0$   65.21 (66.54)     131.83 (120.52)   158.31 (139.16)   207.83 (176.92)   (177.33)
      $\beta_1$   0.72 (0.72)       2.68 (2.36)       4.00 (3.29)       6.56 (5.30)       (5.25)
      $\beta_2$   11.51 (11.48)     14.90 (14.35)     16.61 (15.96)     18.02 (17.28)     (17.49)

* Numbers in parentheses were calculated from the sampling experiment.


Table IV
Relative Bias*
for Homoskedastic Case and Heteroskedasticity of Model 1

Ratio = $\max(\sigma_t^2)/\min(\sigma_t^2)$

Sample
Size    Ratio   Coef.       OLS       White     MWE       MINQUE    MINQUE1

40      1.00    $\beta_0$   0.472     -11.338   0.270     0.202     7.154
                $\beta_1$   0.000     -47.468   -2.110    -3.165    79.114
                $\beta_2$   0.000     -9.958    1.062     1.195     4.249
40      3.00    $\beta_0$   26.658    -10.647   1.099     0.523     9.104
                $\beta_1$   1.971     -47.306   -0.394    -2.628    80.815
                $\beta_2$   -2.612    -9.813    -0.158    1.662     4.669
40      4.94    $\beta_0$   43.554    -10.215   1.817     0.641     10.344
                $\beta_1$   2.869     -47.346   -0.478    -2.869    81.779
                $\beta_2$   -3.721    -9.755    -0.226    1.804     4.849
40      6.82    $\beta_0$   55.239    -9.899    2.312     0.723     11.182
                $\beta_1$   3.383     -47.368   -0.376    -3.008    81.842
                $\beta_2$   -4.335    -9.678    -0.307    1.883     4.905

90      1.00    $\beta_0$   -0.085    -5.527    -0.680    -0.595    3.486
                $\beta_1$   0.000     -56.548   0.000     5.952     107.143
                $\beta_2$   -0.390    -4.677    -0.779    -0.779    0.390
90      4.42    $\beta_0$   8.539     -7.210    -0.716    -0.102    4.500
                $\beta_1$   -25.735   -63.725   -15.931   6.127     84.559
                $\beta_2$   -21.647   -6.088    -0.676    0.507     0.676
90      7.69    $\beta_0$   12.252    -7.936    -0.987    0.146     5.120
                $\beta_1$   -32.483   -64.965   -20.108   6.961     77.340
                $\beta_2$   -27.652   -6.481    -0.972    -0.324    0.864
90      10.82   $\beta_0$   14.262    -8.341    -1.167    0.256     5.323
                $\beta_1$   -35.008   -66.064   -22.021   6.776     76.228
                $\beta_2$   -30.546   -6.665    -1.031    -0.238    1.031

* The figures below each procedure are the ratio of the bias to the true variance, expressed
as a percentage. For the homoskedastic case, the numbers below OLS and MWE were calculated
from the sampling experiment. All the numbers below MINQUE were calculated from the sampling
experiment.

Table V
Relative Bias*
for Homoskedastic Case and Heteroskedasticity of Model 1

Ratio = $\max(\sigma_t^2)/\min(\sigma_t^2)$

Sample
Size    Ratio   Coef.       OLS       White     MWE       MINQUE    MINQUE1

60      1.00    $\beta_0$   0.083     -6.419    0.0417    0.000     1.626
                $\beta_1$   0.000     -12.262   1.691     1.691     6.342
                $\beta_2$   0.000     -7.290    0.214     0.429     1.715
60      3.25    $\beta_0$   22.249    -5.979    0.516     0.395     2.246
                $\beta_1$   16.091    -9.362    2.048     -2.633    8.192
                $\beta_2$   -8.418    -7.132    -0.117    0.117     1.988
60      5.42    $\beta_0$   34.996    -5.729    0.812     0.597     2.626
                $\beta_1$   24.849    -8.059    3.358     2.910     9.178
                $\beta_2$   -11.656   -7.074    -0.161    0.241     2.090
60      7.50    $\beta_0$   43.258    -5.567    0.984     0.748     2.892
                $\beta_1$   30.094    -7.070    3.988     3.263     9.790
                $\beta_2$   -13.291   -7.044    -0.184    0.306     2.205

100     1.00    $\beta_0$   -0.167    -5.174    -0.334    -0.417    1.085
                $\beta_1$   0.000     -11.097   -2.466    -2.466    1.233
                $\beta_2$   0.000     -4.919    0.000     0.000     0.894
100     4.42    $\beta_0$   5.961     -8.059    -1.431    -0.381    1.097
                $\beta_1$   -13.249   -16.129   -3.456    -2.880    0.576
                $\beta_2$   -20.062   -6.366    -0.965    0.193     1.157
100     7.71    $\beta_0$   8.315     -9.184    -2.004    -0.367    1.169
                $\beta_1$   -17.300   -17.676   -4.889    -3.009    0.376
                $\beta_2$   -25.584   -6.642    -1.107    0.246     1.107
100     10.90   $\beta_0$   9.609     -9.815    -2.312    -0.360    1.208
                $\beta_1$   -19.542   -18.146   -5.304    -2.792    0.279
                $\beta_2$   -28.169   -6.862    -1.264    0.181     1.174

* See note on Table IV.

Table VI
True Variances of $\hat{\beta}$*

Sample
Size    Coef.       H(0,0)    H(1,1)    H(1,2)    H(1,3)    H(2,1)    H(2,2)    H(2,3)

30      $\beta_0$   4.0068    5.4141    6.8189    8.2237    8.0799    12.1552   16.2304
        $\beta_1$   0.4971    0.7170    0.9360    1.1550    0.6692    0.8415    1.0137
        $\beta_2$   0.9563    1.6766    2.3964    3.1162    1.8673    2.7788    3.6903
40      $\beta_0$   2.9634    3.8225    4.6792    5.5359    6.1454    9.3290    12.5126
        $\beta_1$   0.0948    0.1522    0.2091    0.2660    0.1027    0.1105    0.1184
        $\beta_2$   0.7532    1.2636    1.7735    2.2835    1.5338    2.3146    3.0954
50      $\beta_0$   2.5372    3.3862    4.2340    5.0817    4.1211    5.7060    7.2909
        $\beta_1$   0.0833    0.1229    0.1623    0.2017    0.0950    0.1067    0.1183
        $\beta_2$   0.6340    1.0885    1.5425    1.9965    0.9636    1.2934    1.6231
60      $\beta_0$   2.3991    3.2946    4.1890    5.0835    4.5405    6.6830    8.8255
        $\beta_1$   0.2365    0.3418    0.4467    0.5516    0.3679    0.4993    0.6308
        $\beta_2$   0.4664    0.8553    1.2440    1.6327    0.8938    1.3212    1.7487
70      $\beta_0$   1.5109    2.4239    3.3359    4.2478    2.9763    4.4422    5.9080
        $\beta_1$   0.0756    0.1297    0.1834    0.2372    0.1010    0.1264    0.1519
        $\beta_2$   0.3416    0.7395    1.1371    1.5347    0.6914    1.0413    1.3911
80      $\beta_0$   1.4829    2.3171    3.1502    3.9834    2.8315    4.1807    5.5298
        $\beta_1$   0.1411    0.2527    0.3640    0.4753    0.1745    0.2079    0.2414
        $\beta_2$   0.2684    0.6096    0.9507    1.2918    0.6030    0.9376    1.2723
90      $\beta_0$   1.1761    1.9557    2.7343    3.5129    2.3632    3.5507    4.7382
        $\beta_1$   0.0336    0.0816    0.1293    0.1771    0.0522    0.0708    0.0894
        $\beta_2$   0.2566    0.5913    0.9258    1.2604    0.5835    0.9103    1.2372
100     $\beta_0$   1.1984    2.0970    2.9945    3.8920    2.2594    3.3205    4.3816
        $\beta_1$   0.0811    0.1736    0.2659    0.3582    0.1267    0.1726    0.2184
        $\beta_2$   0.2236    0.5184    0.8130    1.1076    0.4735    0.7234    0.9733

* Numbers under H(i,j) are the true variances under heteroskedasticity of Model i, level j;
H(0,0) denotes the homoskedastic case.

Figure I: Homoskedastic Case
Estimated 95% C.I. Coverage Prob. of B2

[Plot: C.I. coverage probability (vertical axis) against sample size, 30 to 100 (horizontal
axis); series: TRUE, OLS, WHITE, MWE, MINQUE, MINQUE1.]

Figure II: Heteroskedastic Case
Estimated 95% C.I. Coverage Prob. of B2

Figure III: Homoskedastic Case
Estimated 95% C.I. Coverage Prob. of B1

Figure IV: Heteroskedastic Case
Estimated 95% C.I. Coverage Prob. of B1
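
Appendix

We derive the variance-covariance matrix $V(\dot{\epsilon})$ of the vector of squared OLS
residuals, $\dot{\epsilon}' = (\hat{\epsilon}_1^2, \ldots, \hat{\epsilon}_n^2)$. We assume
that the disturbance process $\epsilon$ is normally distributed with mean 0 and variance
$\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_n^2)$, and define
$M = [m_{ij}] = I - X(X'X)^{-1}X'$. Then $\hat{\epsilon} = M\epsilon$ is also normally
distributed with mean 0 and variance-covariance matrix $\Gamma = M \Sigma M$. Explicitly,

$$
\Gamma =
\begin{bmatrix}
\sum_{t=1}^{n} m_{1t}^2 \sigma_t^2 & \sum_{t=1}^{n} m_{1t} m_{2t} \sigma_t^2 & \cdots & \sum_{t=1}^{n} m_{1t} m_{nt} \sigma_t^2 \\
\vdots & \vdots & \ddots & \vdots \\
\sum_{t=1}^{n} m_{1t} m_{nt} \sigma_t^2 & \sum_{t=1}^{n} m_{2t} m_{nt} \sigma_t^2 & \cdots & \sum_{t=1}^{n} m_{nt}^2 \sigma_t^2
\end{bmatrix}
= [\gamma_{ij}], \ \text{say}.
\tag{A1}
$$

The variance-covariance matrix $V(\dot{\epsilon})$ can be derived following the standard
procedure. Define a real valued vector $t' = (t_1, \ldots, t_n)$ and let
$\lambda = \sqrt{-1}$; then the characteristic function of
$\hat{\epsilon} = (\hat{\epsilon}_1, \ldots, \hat{\epsilon}_n)'$ is given by

$$
G(t) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp(\lambda t'\hat{\epsilon}) \, dF,
\tag{A2}
$$

where $dF$ is the multivariate normal density function of $\hat{\epsilon}$. It can be shown
that the integral in (A2) factorizes into $(n - k)$ single integrals, each of which is bounded
above for all real values of $t_i$ $(i = 1, \ldots, n)$. Thus (A2) is bounded, and therefore
we can differentiate it under the integral signs. Taking partial derivatives of (A2) with
respect to $(\lambda t_i, \lambda t_j)$, $r$ and $s$ times respectively, and putting
$t_1 = t_2 = \cdots = t_n = 0$, we have

$$
\left. \frac{\partial^{r+s} G(t_1, \ldots, t_n)}{\partial(\lambda t_i)^r \, \partial(\lambda t_j)^s} \right|_{t=0}
= \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \hat{\epsilon}_i^{\,r} \hat{\epsilon}_j^{\,s} \, dF
= E(\hat{\epsilon}_i^{\,r} \hat{\epsilon}_j^{\,s}).
$$

Evaluation of the right hand side of (A2) leads to [Anderson (1984, pp. 45-46)]

$$
G(t_1, \ldots, t_n) = \exp\left(-\tfrac{1}{2} t' \Gamma t\right)
\tag{A3}
$$

$$
\phantom{G(t_1, \ldots, t_n)} = \exp\left(-\tfrac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \gamma_{ij} t_i t_j\right),
\tag{A4}
$$

and the following immediately follows from (A3):

$$
\begin{aligned}
E(\hat{\epsilon}_i^2) &= \gamma_{ii}, & i &= 1, \ldots, n, \\
E(\hat{\epsilon}_i^4) &= 3\gamma_{ii}^2, & i &= 1, \ldots, n, \\
E(\hat{\epsilon}_i^2 \hat{\epsilon}_j^2) &= 2\gamma_{ij}^2 + \gamma_{ii}\gamma_{jj}, & i, j &= 1, \ldots, n; \ i \neq j.
\end{aligned}
$$

Consequently, the variances of the elements of $\dot{\epsilon}$ are given by

$$
\mathrm{var}(\hat{\epsilon}_i^2) = E(\hat{\epsilon}_i^4) - \left(E(\hat{\epsilon}_i^2)\right)^2
= 2\gamma_{ii}^2 = 2\left(\sum_{t=1}^{n} m_{it}^2 \sigma_t^2\right)^2
\tag{A5}
$$

for $i = 1, \ldots, n$, and the covariances by

$$
\mathrm{cov}(\hat{\epsilon}_i^2, \hat{\epsilon}_j^2)
= E(\hat{\epsilon}_i^2 \hat{\epsilon}_j^2) - E(\hat{\epsilon}_i^2) E(\hat{\epsilon}_j^2)
= 2\gamma_{ij}^2 = 2\left(\sum_{t=1}^{n} m_{it} m_{jt} \sigma_t^2\right)^2
\tag{A6}
$$

for $i, j = 1, \ldots, n$; $i \neq j$.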
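
The closed forms (A5) and (A6) say that the variance-covariance matrix of the squared residuals is twice the elementwise square of $\Gamma = M \Sigma M$. The Python sketch below evaluates this numerically under simplifying assumptions: an intercept-plus-one-regressor design (rather than the paper's three coefficients) and hypothetical values for $x$ and $\sigma_t^2$. The function names are illustrative and not taken from the paper.

```python
def residual_maker(x):
    """M = I - X(X'X)^{-1}X' for X = [1, x] (intercept plus one regressor)."""
    n = len(x)
    x_bar = sum(x) / n
    d = [v - x_bar for v in x]
    sxx = sum(v * v for v in d)
    return [[(1.0 if i == j else 0.0) - (1.0 / n + d[i] * d[j] / sxx)
             for j in range(n)] for i in range(n)]


def gamma_matrix(m, sigma2):
    """Gamma = M diag(sigma2) M, with gamma_ij = sum_t m_it m_jt sigma_t^2 (A1)."""
    n = len(sigma2)
    return [[sum(m[i][t] * m[j][t] * sigma2[t] for t in range(n))
             for j in range(n)] for i in range(n)]


def squared_residual_covariance(m, sigma2):
    """V with entries 2 * gamma_ij^2, combining (A5) and (A6)."""
    g = gamma_matrix(m, sigma2)
    return [[2.0 * v * v for v in row] for row in g]


# Hypothetical design and disturbance variances.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 12.0]
sigma2 = [1.0, 2.0, 0.5, 1.5, 1.0, 3.0, 0.7, 4.0]
m = residual_maker(x)
v = squared_residual_covariance(m, sigma2)
print("var of squared residuals:", [round(v[i][i], 4) for i in range(len(x))])
```

In the homoskedastic case $\Gamma = \sigma^2 M$, so (A5) reduces to $2\sigma^4 m_{ii}^2$: observations with high leverage (small $m_{ii}$) have squared residuals with small variance.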